Degree centrality based method for cognitive feature selection
ZHANG Xiaofei, YANG Yang, HUANG Jiajin, ZHONG Ning
Journal of Computer Applications    2021, 41 (9): 2767-2772.   DOI: 10.11772/j.issn.1001-9081.2020111794
To address the uncertainty of cognitive feature selection on brain atlases, a Degree Centrality based Cognitive Feature Selection Method (DC-CFSM) was proposed. First, the Functional Brain Network (FBN) of the subjects performing the cognitive experiment tasks was constructed based on the brain atlas, and the Degree Centrality (DC) of each Region Of Interest (ROI) in the FBN was calculated. Next, the significance of the differences of the same cortical ROI across different cognitive states during task execution was statistically tested, and the ROIs were ranked accordingly. Finally, the Human Brain Cognitive Architecture-Area Under Curve (HBCA-AUC) values were calculated for the ranked ROIs, and the performance of several cognitive feature selection methods was evaluated. In experiments on functional Magnetic Resonance Imaging (fMRI) data of mental arithmetic cognitive tasks, the HBCA-AUC values obtained by DC-CFSM on the Task Positive System (TPS), Task Negative System (TNS), and Task Support System (TSS) of the human brain cognitive architecture were 0.669 2, 0.304 0 and 0.468 5 respectively. Compared with Extremely randomized Trees (Extra Trees), Adaptive Boosting (AdaBoost), random forest, and eXtreme Gradient Boosting (XGB), DC-CFSM increased the recognition rate for TPS by 22.17%, 13.90%, 24.32% and 37.19% respectively, and reduced the misrecognition rate for TNS by 20.46%, 29.70%, 44.96% and 33.39% respectively. DC-CFSM better reflects the categories and functions of the human brain cognitive system when selecting cognitive features from brain atlases.
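To make the pipeline above concrete, the following sketch thresholds an ROI correlation matrix into a functional brain network, computes each ROI's degree centrality, and ranks ROIs by the significance of the between-state difference. The correlation threshold, the paired t-test and the array shapes are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np
from scipy import stats

def degree_centrality(roi_timeseries, threshold=0.3):
    """Build an FBN by thresholding the ROI correlation matrix
    (roi_timeseries: n_timepoints x n_rois) and return each ROI's degree."""
    corr = np.corrcoef(roi_timeseries.T)
    np.fill_diagonal(corr, 0.0)
    return (np.abs(corr) > threshold).sum(axis=1)

def rank_rois_by_state_difference(dc_state_a, dc_state_b):
    """dc_state_a/b: (n_subjects x n_rois) degree centralities per cognitive state.
    Rank ROIs by the significance of the between-state difference."""
    _, p_vals = stats.ttest_rel(dc_state_a, dc_state_b, axis=0)
    return np.argsort(p_vals)   # most significant ROIs first (NaNs sort last)

# toy data: 10 subjects, 90 ROIs, 200 time points per state; loose threshold so
# the random data yields non-trivial degrees
rng = np.random.default_rng(0)
dc_a = np.stack([degree_centrality(rng.standard_normal((200, 90)), 0.15) for _ in range(10)])
dc_b = np.stack([degree_centrality(rng.standard_normal((200, 90)), 0.15) for _ in range(10)])
print(rank_rois_by_state_difference(dc_a, dc_b)[:10])
```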
Constrained multi-objective optimization algorithm based on coevolution
ZHANG Xiangfei, LU Yuming, ZHANG Pingsheng
Journal of Computer Applications    2021, 41 (7): 2012-2018.   DOI: 10.11772/j.issn.1001-9081.2020081344
In view of the problem that it is difficult for constrained multi-objective optimization algorithms to effectively balance convergence and diversity, a new constrained multi-objective optimization algorithm based on coevolution was proposed. Firstly, a population with a certain number of feasible solutions was obtained by using a feasible-solution search method based on steady-state evolution. Then, this population was divided into two sub-populations, and both convergence and diversity were achieved through the coevolution of the two sub-populations. Finally, simulation experiments were carried out on the standard constrained multi-objective optimization problems CF1~CF7 and DOC1~DOC7 as well as practical engineering problems to test the solution performance of the proposed algorithm. Experimental results show that compared with Nondominated Sorting Genetic Algorithm Ⅱ based on Constrained Dominance Principle (NSGA-Ⅱ-CDP), the Two-Phase algorithm (ToP), the Push and Pull Search algorithm (PPS) and the Two-Archive Evolutionary Algorithm for Constrained multiobjective optimization (C-TAEA), the proposed algorithm achieves good results in both Inverted Generational Distance (IGD) and HyperVolume (HV), indicating that it can effectively balance convergence and diversity.
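A minimal sketch of the feasible-solution search step described above, assuming a steady-state loop that mutates one individual at a time and keeps the child whenever its total constraint violation does not increase; the mutation operator and the toy constraint are placeholders, not the paper's operators.

```python
import numpy as np

def violation(x, constraints):
    """Total constraint violation; each constraint returns g(x) <= 0 when feasible."""
    return sum(max(0.0, g(x)) for g in constraints)

def steady_state_feasible_search(constraints, dim, bounds, pop_size=30, iters=2000, seed=0):
    """Steady-state loop: mutate one parent per step and keep the child if it
    violates the constraints no more, gradually filling the population with feasible points."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    pop = rng.uniform(low, high, size=(pop_size, dim))
    for _ in range(iters):
        i = rng.integers(pop_size)
        child = np.clip(pop[i] + rng.normal(0, 0.1 * (high - low), dim), low, high)
        if violation(child, constraints) <= violation(pop[i], constraints):
            pop[i] = child
    return pop

# toy constraint: points must lie inside a circle of radius 0.5 centered at (0.7, 0.7)
cons = [lambda x: (x[0] - 0.7) ** 2 + (x[1] - 0.7) ** 2 - 0.25]
pop = steady_state_feasible_search(cons, dim=2, bounds=(0.0, 1.0))
print(sum(violation(x, cons) == 0 for x in pop), "feasible individuals")
```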
Cross-platform user recommendation method based on knowledge graph and restart random walk
YU Dunhui, ZHANG Luyi, ZHANG Xiaoxiao, MAO Liang
Journal of Computer Applications    2021, 41 (7): 1871-1877.   DOI: 10.11772/j.issn.1001-9081.2020111745
Aiming at the problems that recommending similar users on a single social network platform yields monotonous results and that user interests and behavior information are insufficiently understood, a cross-platform User Recommendation method based on Knowledge graph and Restart random walk (URCP-KR) was proposed. First, in the similar subgraphs obtained by segmenting and matching the target platform graph and the auxiliary platform graph, an improved multi-layer Recurrent Neural Network (RNN) was used to predict candidate user entities, and similar users were selected by jointly considering topological structure feature similarity and user portrait similarity. Then, the relationship information of the similar users in the auxiliary platform graph was used to complete the target platform graph. Finally, the probability of each user in the target platform graph walking to every other user in the community was calculated, so that the interest similarity between users was obtained to realize user recommendation. Experimental results show that the proposed method has higher recommendation precision and diversity than the Collaborative Filtering (CF) algorithm, the User Recommendation algorithm based on Cross-Platform online social network (URCP) and the User Recommendation algorithm based on Multi-developer Community (UR-MC), with a recommendation precision of up to 95.31% and a recommendation coverage of up to 88.42%.
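The restart random walk at the heart of the final step can be sketched as a personalized PageRank iteration; the restart probability, toy adjacency matrix and convergence test below are illustrative assumptions.

```python
import numpy as np

def restart_random_walk(adj, start_idx, restart_prob=0.15, tol=1e-8, max_iter=1000):
    """Iterate p = (1 - c) * W^T p + c * e until convergence, where W is the
    row-normalized adjacency matrix and e restarts the walk at the target user."""
    n = adj.shape[0]
    row_sums = adj.sum(axis=1, keepdims=True)
    W = np.divide(adj, row_sums, out=np.zeros_like(adj, dtype=float), where=row_sums > 0)
    e = np.zeros(n)
    e[start_idx] = 1.0
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * W.T @ p + restart_prob * e
        if np.abs(p_next - p).sum() < tol:
            break
        p = p_next
    return p

# toy user graph: 5 users, undirected edges
adj = np.array([[0, 1, 1, 0, 0], [1, 0, 1, 1, 0], [1, 1, 0, 0, 1],
                [0, 1, 0, 0, 1], [0, 0, 1, 1, 0]], dtype=float)
scores = restart_random_walk(adj, start_idx=0)
print(np.argsort(-scores))  # users ranked by interest similarity to user 0
```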
Opinion spam detection based on hierarchical heterogeneous graph attention network
ZHANG Rong, ZHANG Xianguo
Journal of Computer Applications    2021, 41 (5): 1275-1281.   DOI: 10.11772/j.issn.1001-9081.2020081190
Aiming at the problem that non-semantic features of reviews cannot be fully utilized in opinion spam detection, a model based on a hierarchical attention mechanism and a heterogeneous graph attention network, named Hierarchical Heterogeneous Graph Attention Network (HHGAN), was proposed. Firstly, the hierarchical attention mechanism was used to learn word-level and sentence-level document representations, focusing on capturing the words and sentences that are important for opinion spam detection. Then, the learned document representations were used as nodes, and non-semantic features of the reviews were selected as meta-paths to construct a heterogeneous graph attention network with a double-layer attention mechanism. Finally, a Multi-Layer Perceptron (MLP) was designed to determine the categories of reviews. Experimental results on restaurant and hotel datasets extracted from yelp.com show that the F1 values of the HHGAN model reach 0.942 and 0.923 respectively, outperforming the traditional Convolutional Neural Network (CNN) model and other neural network baseline models.
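A rough sketch of the hierarchical (word-level then sentence-level) attention pooling used to build document representations; the random context vectors stand in for learned parameters, and the heterogeneous-graph stage is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(h, context):
    """h: (n_items x d) hidden states; context: (d,) query vector.
    Returns the attention-weighted sum of the items."""
    scores = softmax(h @ context)          # one weight per word (or sentence)
    return scores @ h

def document_representation(sentences, word_ctx, sent_ctx):
    """Hierarchical attention: pool words into sentence vectors,
    then pool sentence vectors into one document vector."""
    sent_vecs = np.stack([attention_pool(s, word_ctx) for s in sentences])
    return attention_pool(sent_vecs, sent_ctx)

rng = np.random.default_rng(0)
d = 16
doc = [rng.standard_normal((n_words, d)) for n_words in (5, 8, 3)]  # 3 sentences
print(document_representation(doc, rng.standard_normal(d), rng.standard_normal(d)).shape)
```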
Event description generation based on generative adversarial network
SUN Heli, SUN Yuzhu, ZHANG Xiaoyun
Journal of Computer Applications    2021, 41 (5): 1256-1261.   DOI: 10.11772/j.issn.1001-9081.2020081242
In Event-Based Social Networks (EBSNs), automatically generating descriptions for social events helps organizers avoid descriptions that are poorly written, overly long or inaccurate, and makes it easier to produce rich, accurate and attractive event descriptions. In order to automatically generate text that is sufficiently similar to real event descriptions, a Generative Adversarial Network (GAN) model named GAN_PG was proposed for event description generation. In the GAN_PG model, a Variational Auto-Encoder (VAE) was used as the generator, and a neural network with Gated Recurrent Units (GRU) was used as the discriminator. In model training, the Policy Gradient (PG) method from reinforcement learning was used for reference, and a reasonable reward function was designed to train the generator to generate event descriptions. Experimental results showed that the BLEU-4 value of the event descriptions generated by GAN_PG reached 0.67, which proves that GAN_PG can generate event descriptions sufficiently similar to natural language in an unsupervised way.
AAC-Hunter: efficient algorithm for discovering aggregation algebraic constraints in relational databases
ZHANG Xiaowei, JIANG Dawei, CHEN Ke, CHEN Gang
Journal of Computer Applications    2021, 41 (3): 636-642.   DOI: 10.11772/j.issn.1001-9081.2020091473
In order to better maintain data integrity and help auditors find anomalous reimbursement records in relational databases, AAC-Hunter (Aggregation Algebraic Constraints Hunter), an algorithm that discovers Aggregation Algebraic Constraints (AACs) automatically, was proposed. An AAC is a fuzzy constraint defined between the aggregation results of two columns in the database that holds for most but not all records. Firstly, joins, groupings and algebraic expressions were enumerated to generate candidate AACs. Secondly, the value-range sets of these candidate AACs were calculated. Finally, the AAC results were output. Since this basic method cannot cope with the performance challenges posed by massive data, a set of heuristic rules was applied to reduce the size of the candidate constraint space, and optimization strategies based on intermediate-result reuse and trivial-candidate elimination were employed to speed up the value-range set calculation for candidate AACs. Experimental results on the TPC-H and European Soccer datasets show that AAC-Hunter reduces the constraint discovery space by 95.68% and 99.94% respectively, and shortens running time by 96.58% and 92.51% respectively, compared with the baseline algorithm without heuristic rules or optimization strategies. These results verify that AAC-Hunter can improve the efficiency and capability of auditing applications.
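A toy pandas sketch of the candidate-generation and value-range steps for ratio-style aggregation constraints; the column names, the quantile-based interval and the coverage parameter are illustrative, and the heuristic pruning and reuse optimizations are not shown.

```python
import numpy as np
import pandas as pd
from itertools import permutations

def candidate_aacs(df, group_col, numeric_cols, coverage=0.9):
    """Enumerate ratio-style candidates SUM(a)/SUM(b) per group and report, for each
    pair, the tightest interval covering `coverage` of the groups."""
    sums = df.groupby(group_col)[list(numeric_cols)].sum()
    results = []
    for a, b in permutations(numeric_cols, 2):
        ratio = (sums[a] / sums[b]).replace([np.inf, -np.inf], np.nan).dropna()
        if ratio.empty:
            continue
        low = ratio.quantile((1 - coverage) / 2)
        high = ratio.quantile(1 - (1 - coverage) / 2)
        results.append({"expr": f"SUM({a})/SUM({b})", "low": low, "high": high,
                        "groups_covered": ((ratio >= low) & (ratio <= high)).mean()})
    return pd.DataFrame(results)

orders = pd.DataFrame({"dept": ["a", "a", "b", "b", "c", "c"],
                       "amount": [100, 120, 200, 210, 90, 95],
                       "tax": [10, 12, 20, 21, 9, 10]})
print(candidate_aacs(orders, "dept", ["amount", "tax"]))
```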
Respiratory sound recognition of chronic obstructive pulmonary disease patients based on HHT-MFCC and short-term energy
CHANG Zheng, LUO Ping, YANG Bo, ZHANG Xiaoxiao
Journal of Computer Applications    2021, 41 (2): 598-603.   DOI: 10.11772/j.issn.1001-9081.2020060881
In order to optimize the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction algorithm, improve the recognition accuracy of respiratory sound signals, and thereby identify Chronic Obstructive Pulmonary Disease (COPD), a feature extraction algorithm fusing Hilbert-Huang Transform (HHT) based MFCC with short-term energy, named HHT-MFCC+Energy, was proposed. Firstly, the Hilbert marginal spectrum and marginal spectrum energy of the preprocessed respiratory sound signal were calculated through HHT. Secondly, the spectral energy was passed through a Mel filter bank to obtain the eigenvector, and the logarithm and discrete cosine transform of the eigenvector were then computed to obtain the HHT-MFCC coefficients. Finally, the short-term energy of the signal was fused with the HHT-MFCC eigenvector to form a new feature, and the signal was identified by a Support Vector Machine (SVM). Three feature extraction algorithms, MFCC, HHT-MFCC and HHT-MFCC+Energy, were combined with SVM to recognize the respiratory sound signals. Experimental results show that the proposed feature fusion algorithm achieves better respiratory sound recognition for both COPD patients and healthy people than the other two algorithms: its average recognition rate reaches 97.8% when extracting 24-dimensional features and selecting 100 training samples, which is 6.9 percentage points and 1.4 percentage points higher than those of MFCC and HHT-MFCC respectively.
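A small sketch of the final fusion step, assuming the per-frame HHT-MFCC vectors have already been computed (random placeholders are used here); the frame length and hop size are typical values, not necessarily those used in the paper.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (no padding, for simplicity)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(x, frame_len=400, hop=160):
    frames = frame_signal(x, frame_len, hop)
    return (frames ** 2).sum(axis=1)

def fuse_features(cepstral, energy):
    """Append log short-term energy as one extra dimension per frame."""
    n = min(len(cepstral), len(energy))
    return np.hstack([cepstral[:n], np.log(energy[:n] + 1e-10)[:, None]])

rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)                # 1 s of audio at 16 kHz
energy = short_term_energy(signal)
hht_mfcc = rng.standard_normal((len(energy), 24))  # stand-in for 24-dim HHT-MFCC per frame
print(fuse_features(hht_mfcc, energy).shape)       # (n_frames, 25)
```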
Semantic segmentation method based on edge attention model
SHE Yulong, ZHANG Xiaolong, CHENG Ruoqin, DENG Chunhua
Journal of Computer Applications    2021, 41 (2): 343-349.   DOI: 10.11772/j.issn.1001-9081.2020050725
The liver is the main organ of human metabolic function. At present, the main problems of machine learning in the semantic segmentation of liver images are as follows: 1) there are the inferior vena cava, soft tissue and blood vessels in the middle of the liver, and even some necrosis or hepatic fissures; 2) the boundary between the liver and some adjacent organs is blurred and difficult to distinguish. In order to solve these problems, the Edge Attention Model (EAM) and the Edge Attention Net (EANet) were proposed using an Encoder-Decoder framework. In the encoder, the residual network ResNet34 pre-trained on ImageNet and the EAM were utilized to fully obtain the detailed feature information of the liver edge; in the decoder, the deconvolution operation and the proposed EAM were used to extract the useful information, thereby obtaining the semantic segmentation map of the liver image. Finally, smoothing was applied to segmentation results containing considerable noise. Comparison experiments with AHCNet were conducted on three datasets, and the results showed that: on the 3Dircadb dataset, the Volumetric Overlap Error (VOE) and Relative Volume Difference (RVD) of EANet were decreased by 1.95 percentage points and 0.11 percentage points respectively, and the DICE accuracy was increased by 1.58 percentage points; on the Sliver07 dataset, the VOE, Maximum Surface Distance (MSD) and Root Mean Square Surface Distance (RMSD) of EANet were decreased by approximately 1 percentage point, 3.3 mm and 0.2 mm respectively; on a clinical MRI liver image dataset of a hospital, the VOE and RVD of EANet were decreased by 0.88 percentage points and 0.31 percentage points respectively, and the DICE accuracy was increased by 1.48 percentage points. Experimental results indicate that the proposed EANet achieves good liver image segmentation.
Social recommendation based on dynamic integration of social information
REN Kezhou, PENG Furong, GUO Xin, WANG Zhe, ZHANG Xiaojing
Journal of Computer Applications    2021, 41 (10): 2806-2812.   DOI: 10.11772/j.issn.1001-9081.2020111892
Aiming at the problem of data sparseness in recommendation algorithms, social data are usually introduced as auxiliary information for social recommendation. Traditional social recommendation algorithms ignore users' interest transfer, so the model cannot describe the dynamic characteristics of user interests; they also ignore the dynamic characteristics of social influence, so the model treats long-past and recent social behaviors equally. Aiming at these two problems, a social recommendation model with dynamic integration of social information, named SLSRec, was proposed. First, a self-attention mechanism was used to build a sequence model of user interaction items to describe user interests dynamically. Then, an attention mechanism with time-based forgetting was designed to model short-term social interests, and an attention mechanism with collaborative characteristics was designed to model long-term social interests. Finally, the long-term and short-term social interests and the user's short-term interests were combined to obtain the user's final interests and generate the next recommendation. Normalized Discounted Cumulative Gain (NDCG) and Hit Rate (HR) were used to compare the proposed model with the sequence recommendation model SASRec (Self-Attention Sequence Recommendation) and the social recommendation model DiffNet (neural influence Diffusion Network for social recommendation) on the sparse dataset brightkite and the dense dataset Last.FM. Experimental results show that compared with DiffNet, SLSRec improves HR by 8.5% on the sparse dataset; compared with SASRec, SLSRec improves NDCG by 2.1% on the dense dataset, indicating that considering the dynamic characteristics of social information makes the recommendation results more accurate.
Short-term traffic flow prediction based on empirical mode decomposition and long short-term memory neural network
ZHANG Xiaohan, FENG Aimin
Journal of Computer Applications    2021, 41 (1): 225-230.   DOI: 10.11772/j.issn.1001-9081.2020060919
Traffic flow prediction is an important part of intelligent transportation. The traffic data to be processed are non-linear, periodic and random; as a result, unstable traffic flow data depend on a long-term data range during prediction. At the same time, due to external factors, the original data often contain noise, which may further degrade prediction performance. Aiming at these problems, a prediction algorithm named EMD-LSTM, which can denoise the data and handle long-term dependence, was proposed. Firstly, Empirical Mode Decomposition (EMD) was used to gradually decompose the components of different scales in the traffic time series data and generate a series of intrinsic mode functions with the same feature scale, thereby removing certain noise. Then, a Long Short-Term Memory (LSTM) neural network was used to solve the long-term dependence problem of the data, so that the algorithm performs better in long-term prediction. Experimental results of short-term prediction on real datasets show that EMD-LSTM achieves a Mean Absolute Error (MAE) 1.916 32 lower and a Mean Absolute Percentage Error (MAPE) 4.645 45 percentage points lower than LSTM. It can be seen that the proposed hybrid model significantly improves the prediction accuracy and can handle traffic data effectively.
Log analysis and workload characteristic extraction in distributed storage system
GOU Zi'an, ZHANG Xiao, WU Dongnan, WANG Yanqiu
Journal of Computer Applications    2020, 40 (9): 2586-2593.   DOI: 10.11772/j.issn.1001-9081.2020010121
Analyzing the workload running on a file system helps to optimize the performance of distributed file systems and is crucial to the construction of new storage systems. Due to the complexity of workloads and their increasing diversity of scale, intuition-based analysis alone cannot fully capture the characteristics of workload traces. To solve this problem, a distributed log analysis and workload characteristic extraction model was proposed. First, read- and write-related information was extracted from distributed file system logs according to keywords. Second, the workload characteristics were described from two aspects: statistics and timing. Finally, the possibility of system optimization based on workload characteristics was analyzed. Experimental results show that the proposed model is feasible and accurate, and can give detailed workload statistics and timing characteristics. It has the advantages of low overhead, high timeliness and being easy to analyze, and can be used to guide the synthesis of workloads with the same characteristics, hot-spot data monitoring, and cache prefetching optimization of the system.
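A simplified sketch of keyword-based extraction of read/write records from text logs plus basic statistical and timing summaries; the log format, keywords and regular expression are assumptions for illustration only.

```python
import re
from collections import Counter
from datetime import datetime

LINE = re.compile(r"^(?P<ts>\S+ \S+) .*?(?P<op>READ|WRITE) .*?size=(?P<size>\d+)")

def extract_workload(log_lines):
    """Pull (timestamp, operation, size) triples out of raw log lines by keyword."""
    records = []
    for line in log_lines:
        m = LINE.search(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            records.append((ts, m["op"], int(m["size"])))
    return records

def characterize(records):
    """Statistical view (operation mix, bytes moved) and a simple timing view
    (peak requests per second) of the extracted workload."""
    ops = Counter(op for _, op, _ in records)
    total_bytes = sum(size for _, _, size in records)
    per_second = Counter(ts for ts, _, _ in records)
    return ops, total_bytes, max(per_second.values(), default=0)

logs = [
    "2020-01-01 10:00:00 INFO datanode WRITE blk_1 size=4096",
    "2020-01-01 10:00:00 INFO datanode READ blk_2 size=8192",
    "2020-01-01 10:00:01 INFO datanode WRITE blk_3 size=4096",
]
print(characterize(extract_workload(logs)))
```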
Multi-branch neural network model based weakly supervised fine-grained image classification method
BIAN Xiaoyong, JIANG Peiling, ZHAO Min, DING Sheng, ZHANG Xiaolong
Journal of Computer Applications    2020, 40 (5): 1295-1300.   DOI: 10.11772/j.issn.1001-9081.2019111883
Concerning the problem that traditional attention-based neural networks cannot jointly attend to local features and rotation-invariant features, a weakly supervised fine-grained image classification method based on a multi-branch neural network model was proposed. Firstly, a lightweight Class Activation Map (CAM) network was utilized to localize local regions with potential semantic information, and a residual network ResNet-50 with deformable convolution and an Oriented Response Network (ORN) with rotation-invariant coding were designed. Secondly, pre-trained models were employed to initialize the feature networks, and the original image and the above regions were input to fine-tune the model. Finally, the three intra-branch losses and the between-branch losses were combined to optimize the entire network, and classification and prediction were performed on the test set. The proposed method achieves classification accuracies of 87.7% and 90.8% on the CUB-200-2011 dataset and the FGVC_Aircraft dataset respectively, which are 1.2 percentage points and 0.9 percentage points higher than those of the Multi-Attention Convolutional Neural Network (MA-CNN) method. On the Aircraft_2 dataset, the proposed method reaches 91.8% classification accuracy, which is 4.1 percentage points higher than that of ResNet-50. The experimental results show that the proposed method effectively improves the accuracy of weakly supervised fine-grained image classification.
Image inpainting based on dilated convolution
FENG Lang, ZHANG Ling, ZHANG Xiaolong
Journal of Computer Applications    2020, 40 (3): 825-831.   DOI: 10.11772/j.issn.1001-9081.2019081471
Although existing image inpainting methods can recover the content of missing areas, problems such as structure distortion, texture blurring and content discontinuity remain, so the inpainted images cannot meet people's visual requirements. To solve these problems, an image inpainting method based on dilated convolution was proposed, which improves inpainting quality by introducing dilated convolution to enlarge the receptive field. The method follows the idea of Generative Adversarial Network (GAN) and is divided into a generative network and an adversarial network. The generative network consists of a global content inpainting network and a local detail inpainting network, and gated convolution was used to learn image features dynamically, overcoming the inability of traditional convolutional neural network methods to complete large irregular missing areas well. Firstly, the global content inpainting network was used to obtain an initial completion result, and then the local texture details were repaired by the local detail inpainting network. The adversarial network consists of an SN-PatchGAN discriminator and evaluates the inpainting effect. Experimental results show that compared with current image inpainting methods, the proposed method achieves clear improvements in Peak Signal-to-Noise Ratio (PSNR), Structural SIMilarity (SSIM) and inception score. Moreover, the method effectively alleviates the texture blurring of traditional inpainting methods and better meets people's visual requirements, verifying its validity and feasibility.
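A minimal sketch of a gated convolution layer with a dilation rate, the kind of building block the generative network relies on; the channel counts, kernel size and tanh/sigmoid split are illustrative choices, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GatedDilatedConv(nn.Module):
    """Gated convolution with a dilation rate: one branch produces features, the
    other a sigmoid mask that learns which spatial positions are valid."""
    def __init__(self, in_ch, out_ch, dilation):
        super().__init__()
        pad = dilation  # keeps spatial size for a 3x3 kernel
        self.feature = nn.Conv2d(in_ch, out_ch, 3, padding=pad, dilation=dilation)
        self.gate = nn.Conv2d(in_ch, out_ch, 3, padding=pad, dilation=dilation)

    def forward(self, x):
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

x = torch.randn(1, 4, 64, 64)   # masked image plus mask channel, stacked
block = GatedDilatedConv(4, 32, dilation=2)
print(block(x).shape)           # torch.Size([1, 32, 64, 64])
```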
Performance optimization of distributed file system based on new type storage devices
DONG Cong, ZHANG Xiao, CHENG Wendi, SHI Jia
Journal of Computer Applications    2020, 40 (12): 3594-3603.   DOI: 10.11772/j.issn.1001-9081.2020050632
The I/O performance of new types of storage devices is usually an order of magnitude higher than that of the traditional Solid State Disk (SSD). However, simply replacing SSD with a new storage device does not significantly improve the performance of a distributed file system, which means the current distributed file system cannot give full play to the performance of the new devices. To solve this problem, the data writing and transmission processes of the Hadoop Distributed File System (HDFS) were analyzed quantitatively. Quantitative analysis of the time consumption of each stage of the HDFS write process showed that inter-node data transmission is the most time-consuming part. Therefore, a corresponding optimization strategy was proposed: data transmission and processing were parallelized by means of asynchronous writes, so that the processing stages of different data packets overlap, shortening the total processing time of data writing and thereby improving the write performance of HDFS. Experimental results show that the proposed scheme improves the HDFS write throughput by 15%-24% and reduces the overall write execution time by 28%-36%.
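A toy sketch of the asynchronous-write idea: packet processing and transmission are decoupled by a queue so the two stages overlap instead of running strictly in sequence; the sleeps stand in for real checksum and network costs and do not reflect HDFS internals.

```python
import queue
import threading
import time

def process_packets(packets, out_q):
    """Producer: local processing stage; hands finished packets to the sender."""
    for pkt in packets:
        time.sleep(0.01)          # stand-in for checksum/buffering work
        out_q.put(pkt)
    out_q.put(None)               # end-of-stream marker

def send_packets(out_q):
    """Consumer: network-transmission stage, overlapped with processing."""
    while True:
        pkt = out_q.get()
        if pkt is None:
            break
        time.sleep(0.01)          # stand-in for sending the packet to the next node

packets = list(range(100))
start = time.time()
q = queue.Queue(maxsize=16)
sender = threading.Thread(target=send_packets, args=(q,))
sender.start()
process_packets(packets, q)
sender.join()
print(f"pipelined write took {time.time() - start:.2f}s (vs ~{100 * 0.02:.2f}s serial)")
```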
Chinese short text classification model with multi-head self-attention mechanism
ZHANG Xiaochuan, DAI Xuyao, LIU Lu, FENG Tianshuo
Journal of Computer Applications    2020, 40 (12): 3485-3489.   DOI: 10.11772/j.issn.1001-9081.2020060914
Aiming at the problem that the lack of context information in Chinese short texts causes semantic ambiguity and thus feature sparsity, a text classification model combining a Convolutional Neural Network and a Multi-Head self-Attention mechanism (CNN-MHA) was proposed. Firstly, the existing Bidirectional Encoder Representations from Transformers (BERT) pre-trained language model was used to format sentence-level short texts as character-level vectors. Secondly, in order to reduce noise, the Multi-Head self-Attention mechanism (MHA) was used to learn the word dependence inside the text sequence and generate hidden-layer vectors with global semantic information. Then, the hidden-layer vectors were input into the Convolutional Neural Network (CNN) to generate the text classification feature vector. In order to improve the classification, the output of the convolutional layer was fused with the sentence features extracted by the BERT model and then input to the classifier. Finally, the CNN-MHA model was compared with the TextCNN, BERT and TextRCNN models. Experimental results on the SogouCS dataset show that the F1 performance of the proposed model is 3.99%, 0.76% and 2.89% higher than those of the comparison models respectively, which proves the effectiveness of the improved model.
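A compact numpy sketch of multi-head self-attention over a sequence of character-level vectors, the MHA stage described above; the random projection matrices stand in for learned weights, and the BERT encoding and convolutional stages are omitted.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (seq_len, d_model). Project to Q, K, V, split into heads,
    attend per head, concatenate and project back."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for h in range(n_heads):
        sl = slice(h * d_head, (h + 1) * d_head)
        scores = softmax(Q[:, sl] @ K[:, sl].T / np.sqrt(d_head))
        heads.append(scores @ V[:, sl])
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
d, L, H = 32, 20, 4                      # vector size, sequence length, heads
X = rng.standard_normal((L, d))          # character-level vectors from the encoder
W = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
print(multi_head_self_attention(X, *W, n_heads=H).shape)  # (20, 32)
```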
Social event participation prediction based on event description
SUN Heli, SUN Yuzhu, ZHANG Xiaoyun
Journal of Computer Applications    2020, 40 (11): 3101-3106.   DOI: 10.11772/j.issn.1001-9081.2020030418
In research on Event-Based Social Networks (EBSNs), predicting the participation of social events from their event descriptions is difficult. Related studies are very limited, and the difficulty mainly comes from the subjectivity of evaluating event descriptions and the limitations of language modeling algorithms. To solve these problems, the concepts of successful event, similar event and event similarity were first defined. Based on these concepts, social data collected from the Meetup platform were extracted, and analysis and prediction methods based on Lasso regression, Convolutional Neural Network (CNN) and Gated Recurrent Neural Network (GRNN) were designed. In the experiments, part of the extracted data was used to train the three models, and the remaining data was used for analysis and prediction. The results showed that, compared with events without event descriptions, the prediction accuracy for events processed by the Lasso regression model was improved by 2.35% to 3.8% across different classifiers, and that for events processed by the GRNN model was improved by 4.5% to 8.9%, while the CNN model did not yield satisfactory results. This study shows that event descriptions can improve event participation, and that the GRNN model has the highest prediction accuracy among the three models.
Early diagnosis and prediction of Parkinson's disease based on clustering medical text data
ZHANG Xiaobo, YANG Yan, LI Tianrui, LU Fan, PENG Lilan
Journal of Computer Applications    2020, 40 (10): 3088-3094.   DOI: 10.11772/j.issn.1001-9081.2020030359
In view of the early intelligent diagnosis of Parkinson's Disease (PD), which occurs commonly in the elderly, clustering technologies based on medical detection text data were applied to the analysis and prediction of PD. Firstly, the original dataset was pre-processed to obtain effective feature information, and these features were reduced by Principal Component Analysis (PCA) to eight spaces of different dimensions. Then, five classical clustering models and three different clustering ensemble methods were used to cluster the data in the eight dimensional spaces. Finally, four clustering performance indexes were selected to predict PD subjects with dopamine deficiency, healthy controls, and Scans Without Evidence of Dopamine Deficiency (SWEDD) PD subjects. The simulation results show that the clustering accuracy of the Gaussian Mixture Model (GMM) reaches 89.12% when the PCA feature dimension is 30, the clustering accuracy of Spectral Clustering (SC) is 61.41% when the PCA feature dimension is 70, and the clustering accuracy of the Meta-CLustering Algorithm (MCLA) reaches 59.62% when the PCA feature dimension is 80. Comparative experiments show that GMM has the best clustering effect among the five classical clustering methods when the PCA feature dimension is less than 40, and MCLA shows excellent clustering performance among the three clustering ensemble methods across different feature dimensions, thereby providing technical and theoretical support for the early intelligent auxiliary diagnosis of PD.
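A minimal sklearn sketch of the PCA-then-GMM pipeline on synthetic data standing in for the pre-processed medical text features; adjusted Rand index is used here as a stand-in for the clustering accuracy reported above.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

# synthetic stand-in for pre-processed features (600 subjects, 100 features, 3 groups)
X, y = make_blobs(n_samples=600, n_features=100, centers=3, cluster_std=8.0, random_state=0)

for n_dims in (10, 30, 70):
    X_red = PCA(n_components=n_dims, random_state=0).fit_transform(X)
    labels = GaussianMixture(n_components=3, random_state=0).fit_predict(X_red)
    print(n_dims, round(adjusted_rand_score(y, labels), 3))
```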
Influence maximization algorithm based on reverse PageRank
ZHANG Xianli, TANG Jianxin, CAO Laicheng
Journal of Computer Applications    2020, 40 (1): 96-102.   DOI: 10.11772/j.issn.1001-9081.2019061066
Concerning the problem that existing influence maximization algorithms for social networks struggle to simultaneously meet the requirements of propagation range, time cost and memory usage on large-scale networks, a heuristic algorithm of Mixed PageRank and Degree (MPRD) was proposed. Firstly, the idea of reverse PageRank was introduced to evaluate node influence. Secondly, a mixed index based on reverse PageRank and degree centrality was designed to evaluate the final influence of nodes. Finally, the seed node set was selected with a similarity-based method that filters out nodes with seriously overlapping influence. Experiments were conducted on six datasets and two propagation models. The experimental results show that the proposed MPRD is superior to existing heuristic algorithms in terms of propagation range, is four to five orders of magnitude faster than the greedy algorithm, and requires less memory than the reverse-sampling-based Influence Maximization based on Martingale (IMM) algorithm. MPRD achieves a balance of propagation range, time cost and memory usage when solving the influence maximization problem on large-scale networks.
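A small sketch of the mixed index: PageRank is run on the transposed graph (reverse PageRank) and blended with a degree term; the mixing weight, normalization and the omitted overlap-filtering step are illustrative assumptions.

```python
import numpy as np

def pagerank(adj, alpha=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank; dangling nodes jump uniformly."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    M = np.where(out_deg[:, None] > 0, adj / np.maximum(out_deg[:, None], 1), 1.0 / n).T
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_new = alpha * M @ r + (1 - alpha) / n
        if np.abs(r_new - r).sum() < tol:
            break
        r = r_new
    return r

def mprd_scores(adj, mix=0.5):
    """Mixed index: reverse PageRank (PageRank on the transposed graph)
    blended with normalized out-degree centrality."""
    rev_pr = pagerank(adj.T)
    degree = adj.sum(axis=1) / max(adj.sum(), 1)
    return mix * rev_pr + (1 - mix) * degree

adj = np.array([[0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]], dtype=float)
print(np.argsort(-mprd_scores(adj)))   # candidate seed nodes, best first
```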
Application of KNN algorithm based on value difference metric and clustering optimization in bank customer behavior prediction
LI Bo, ZHANG Xiao, YAN Jingyi, LI Kewei, LI Heng, LING Yulong, ZHANG Yong
Journal of Computer Applications    2019, 39 (9): 2784-2788.   DOI: 10.11772/j.issn.1001-9081.2019030571
In order to improve the accuracy of loan customer behavior prediction, and aiming at the problem that the traditional K-Nearest Neighbors (KNN) algorithm handles non-numerical factors incompletely in data analysis, an improved KNN algorithm based on the Value Difference Metric (VDM) distance and iterative optimization of clustering results was proposed. Firstly, the collected data were clustered by the KNN algorithm based on the VDM distance; then the clustering results were analyzed iteratively; finally, the prediction accuracy was improved through joint training. Experiments on customer data collected by Portuguese retail banks from 2008 to 2013 show that, compared with the traditional KNN algorithm, the FCD-KNN (Feature Correlation Difference KNN) algorithm, the Gaussian Naive Bayes algorithm and the Gradient Boosting algorithm, the improved KNN algorithm has better performance and stability, and has great application value in customer behavior prediction from bank data.
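A toy sketch of the Value Difference Metric (q = 1) over categorical attributes combined with a plain KNN vote; the example data are made up, and the iterative clustering refinement and joint training are not shown.

```python
import numpy as np
from collections import defaultdict

def vdm_tables(X_cat, y):
    """Conditional class frequencies P(class | attribute value) per categorical column."""
    classes = sorted(set(y))
    tables = []
    for col in X_cat.T:
        counts = defaultdict(lambda: np.zeros(len(classes)))
        for v, label in zip(col, y):
            counts[v][classes.index(label)] += 1
        tables.append({v: c / c.sum() for v, c in counts.items()})
    return tables

def vdm_distance(a, b, tables):
    """Sum of per-attribute Value Difference Metric distances (q = 1)."""
    return sum(np.abs(t[x] - t[z]).sum() for x, z, t in zip(a, b, tables))

def knn_predict(x, X_train, y_train, tables, k=3):
    d = [vdm_distance(x, row, tables) for row in X_train]
    votes = [y_train[i] for i in np.argsort(d)[:k]]
    return max(set(votes), key=votes.count)

X = np.array([["blue", "single"], ["blue", "married"], ["red", "single"], ["red", "married"]])
y = ["yes", "yes", "no", "no"]
tables = vdm_tables(X, y)
print(knn_predict(np.array(["blue", "single"]), X, y, tables, k=1))
```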
Online task scheduling algorithm for big data analytics based on cumulative running work
LI Yefei, XU Chao, XU Daoqiang, ZOU Yunfeng, ZHANG Xiaoda, QIAN Zhuzhong
Journal of Computer Applications    2019, 39 (8): 2431-2437.   DOI: 10.11772/j.issn.1001-9081.2019010073
A Cumulative Running Work (CRW) based task scheduler, CRWScheduler, was proposed to effectively schedule tasks without any prior knowledge on big data analytics platforms such as Hadoop and Spark. A running job is moved from a low-weight queue to a high-weight one based on its CRW. When resources are allocated to a job, both the queue of the job and its instantaneous resource utilization are considered, significantly improving overall system performance without prior knowledge. A prototype of CRWScheduler was implemented on Apache Hadoop YARN. Experimental results on a 28-node benchmark cluster show that CRWScheduler reduces the average Job Flow Time (JFT) by 21% and the 95th-percentile JFT by up to 35% compared with the YARN fair scheduler. Further improvements can be obtained when CRWScheduler cooperates with task-level schedulers.
Pneumonia image recognition model based on deep neural network
HE Xinyu, ZHANG Xiaolong
Journal of Computer Applications    2019, 39 (6): 1680-1684.   DOI: 10.11772/j.issn.1001-9081.2018102112
Current pneumonia image recognition algorithms face two problems. First, the extracted features do not fit pneumonia images well, because the source dataset of the transfer learning model used as the feature extractor differs greatly from the pneumonia dataset. Second, the softmax classifier used by these algorithms does not handle high-dimensional features well, leaving room for improvement in recognition accuracy. Aiming at these two problems, a pneumonia image recognition algorithm based on a Deep Convolutional Neural Network (DCNN) was proposed. Firstly, the GoogLeNet Inception V3 network model trained on the ImageNet dataset was used to extract features. Then, a feature fusion layer was added, and a random forest classifier was used for classification and prediction. Experiments were conducted on the Chest X-Ray Images pneumonia standard dataset. The experimental results show that the recognition accuracy, sensitivity and specificity of the proposed model reach 96.77%, 97.56% and 94.26% respectively. The proposed model is 1.26 percentage points and 1.46 percentage points higher than the classic GoogLeNet Inception V3+Data Augmentation (GIV+DA) algorithm in recognition accuracy and sensitivity, and is close to the optimal result of GIV+DA in specificity.
Provable radio frequency identification authentication protocol with scalability
SHI Zhicai, WANG Yihan, ZHANG Xiaomei, CHEN Shanshan, CHEN Jiwei
Journal of Computer Applications    2019, 39 (3): 774-778.   DOI: 10.11772/j.issn.1001-9081.2018081648
Most widely used Radio Frequency IDentification (RFID) tags are passive and have very limited computing and memory resources, which makes it difficult to solve the security, privacy and scalability problems of RFID authentication protocols. Based on a Hash function, a security-provable lightweight authentication protocol was proposed. The protocol ensures the confidentiality and privacy of sessions during authentication by hashing and randomizing. Firstly, the identity of a tag was confirmed by its pseudonym and was kept from leaking to any untrusted entity such as a reader. Secondly, only one Hash computation was needed to confirm a tag's identity in the backend server, and the search time for the tag's identity was kept constant by using the identifier to construct a Hash table. Finally, after each authentication, the secret and pseudonym of the tag were updated to ensure forward security of the protocol. It is proved that the proposed protocol satisfies scalability, forward security and anonymity, and can resist eavesdropping, tracing attacks, replay attacks and de-synchronization attacks. The protocol requires only a Hash function and pseudorandom number generation on the tag, so it is very suitable for low-cost RFID systems.
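A simplified sketch of the hash-based flow: the backend server indexes tags by pseudonym in a hash table for constant-time lookup, verifies one hash per authentication, and both sides roll the secret and pseudonym afterwards; the message layout, nonce sizes and update rule are illustrative assumptions, not the exact protocol.

```python
import hashlib
import os

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

class Server:
    def __init__(self):
        self.table = {}                        # pseudonym -> (tag_id, secret)

    def register(self, tag_id: bytes, secret: bytes):
        self.table[h(tag_id, secret)] = (tag_id, secret)

    def authenticate(self, pseudonym, reader_nonce, tag_nonce, response):
        if pseudonym not in self.table:        # constant-time lookup via hash table
            return False
        tag_id, secret = self.table[pseudonym]
        if response != h(secret, reader_nonce, tag_nonce):
            return False
        # forward security: roll the secret and pseudonym after a successful run
        new_secret = h(secret, tag_nonce)
        del self.table[pseudonym]
        self.table[h(tag_id, new_secret)] = (tag_id, new_secret)
        return True

class Tag:
    def __init__(self, tag_id, secret):
        self.tag_id, self.secret = tag_id, secret
        self.pseudonym = h(tag_id, secret)

    def respond(self, reader_nonce):
        tag_nonce = os.urandom(8)
        resp = h(self.secret, reader_nonce, tag_nonce)
        # the full protocol confirms success before updating, to stay synchronized
        self.secret = h(self.secret, tag_nonce)
        self.pseudonym = h(self.tag_id, self.secret)
        return tag_nonce, resp

server, tag = Server(), Tag(b"tag-001", b"initial-secret")
server.register(b"tag-001", b"initial-secret")
reader_nonce, old_pseudonym = os.urandom(8), tag.pseudonym
tag_nonce, resp = tag.respond(reader_nonce)
print(server.authenticate(old_pseudonym, reader_nonce, tag_nonce, resp))   # True
```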
Security verification method of safety critical software based on system theoretic process analysis
WANG Peng, WU Kang, YAN Fang, WANG Kenian, ZHANG Xiaochen
Journal of Computer Applications    2019, 39 (11): 3298-3303.   DOI: 10.11772/j.issn.1001-9081.2019040688
The functional implementation of modern safety-critical systems is increasingly dependent on software, so software security is crucial to system security, and the complexity of software makes it difficult for traditional safety analysis methods to capture the hazards arising from component interactions. In order to ensure the security of safety-critical systems, a software security verification method based on System Theoretic Process Analysis (STPA) was proposed. On the basis of the security control structure, the process model and its variables were constructed for the software, the system context in which dangerous behaviors occur was specified and analyzed, and the software security requirements were generated. Then, taking the software design of a landing gear control system as an example, software security verification was carried out by model checking. The results show that the proposed method can effectively identify potentially dangerous control paths in the software at the system level and reduce the dependence on manual analysis.
Classification method and updating mechanism of hierarchical 3D indoor map
FENG Guangsheng, ZHANG Xiaoxue, WANG Huiqiang, LI Bingyang, YUAN Quan, CHEN Shijun, CHEN Dawei
Journal of Computer Applications    2019, 39 (1): 78-81.   DOI: 10.11772/j.issn.1001-9081.2018071657
Since existing map updating methods perform poorly in indoor map environments, a hierarchical indoor map updating method was proposed. Firstly, the activity of indoor objects was taken as a parameter. Then, a hierarchical division was performed to reduce the amount of updated data. Finally, a Convolutional Neural Network (CNN) was used to determine the level to which the indoor data belong. The experimental results show that compared with the version update method, the update time of the proposed method is reduced by 27 percentage points, and compared with the incremental update method, the update time gradually decreases once the number of update items exceeds 100. Compared with the incremental update method, the update package size of the proposed method is reduced by 6.2 percentage points, and its update package is always smaller than that of the version update method when the number of data items is less than 200. Therefore, the proposed method can significantly improve the updating efficiency of indoor maps.
New post quantum authenticated key exchange protocol based on ring learning with errors problem
LI Zichen, XIE Ting, CAI Juliang, ZHANG Xiaowei
Journal of Computer Applications    2018, 38 (8): 2243-2248.   DOI: 10.11772/j.issn.1001-9081.2018020387
In view of the fact that the rapid development of quantum computing poses a serious threat to the security of traditional public-key cryptosystems, a new authenticated key exchange protocol based on the Ring Learning With Errors (RLWE) problem was proposed. By using the Peikert error reconciliation mechanism, both communicating parties can directly obtain uniformly distributed shared bits and derive the same session key. The encoding bases of the lattice were used to analyze the error tolerance, and reasonable parameters were selected to ensure that both parties obtain the same session key with high probability. The security of the protocol was proved in the BR (Bellare-Rogaway) model with weak perfect forward secrecy. The security of the protocol reduces to the hardness of the RLWE problem on lattices, so the protocol can resist quantum attacks. Compared with existing RLWE-based authenticated key exchange protocols, the modulus decreases from sub-exponential to polynomial magnitude, so the corresponding computation and communication costs are also significantly reduced. The results show that the proposed scheme is a more concise and efficient post-quantum authenticated key exchange protocol.
Task performance collection and classification method in cloud platforms
LIU Chunyi, ZHANG Xiao, QIN Yuansong, LU Shangqi
Journal of Computer Applications    2018, 38 (6): 1665-1669.   DOI: 10.11772/j.issn.1001-9081.2017102790
When actually using cloud platforms, it is difficult for users to determine the appropriate type of cloud host, which results in low utilization of cloud platform resources. Some typical approaches to this problem optimize placement algorithms from the cloud provider's perspective, where the users' own choices still limit resource utilization; others collect and predict task performance on the cloud platform over a short time, which reduces the accuracy of task classification. In order to improve cloud platform resource utilization and simplify user operations, a multi-attribute task performance collection tool named Lbenchmark was proposed to collect the performance characteristics of tasks comprehensively, reducing the collection load by more than 50% compared with Ganglia. Then, based on the performance data, a K-Nearest Neighbor (KNN) application performance classification algorithm using multiple K-Dimension (KD) trees with configurable weights was proposed. Suitable parameters were selected to build multiple KD-tree-based KNN classifiers, and cross validation was used to adjust the weight of each attribute in the different classifiers. The experimental results show that, compared with the traditional KNN algorithm, the calculation speed of the proposed algorithm is increased by about 10 times, and its accuracy is improved by about 10% on average. The proposed algorithm can use data feature mapping to provide resource recommendations to users and cloud providers, improving the overall utilization of cloud platforms.
Computation offloading scheme based on time switch policy for energy harvesting in device-to-device communication
DONG Xinsong, ZHENG Jianchao, CAI Yueming, YIN Tinghui, ZHANG Xiaoyi
Journal of Computer Applications    2018, 38 (12): 3535-3540.   DOI: 10.11772/j.issn.1001-9081.2018051171
In order to improve the effectiveness of mobile cloud computing in Device-to-Device (D2D) communication networks, a computation offloading scheme based on a time-switching policy for energy harvesting was proposed. Firstly, the computational tasks to be offloaded by a traffic-limited smart mobile terminal were sent as Radio-Frequency (RF) signals to an energy-limited smart mobile terminal through D2D communication, and the energy-limited terminal used the time-switching policy to harvest energy from the received signals. Then, the energy-limited terminal spent extra traffic to relay the traffic-limited terminal's tasks to the cloud server. Finally, the proposed scheme was modeled as a non-convex optimization problem that minimizes terminal energy and traffic consumption, and the optimal scheme was obtained by optimizing the time-switching factor and the harvested-energy allocation factor of the energy-limited terminal, and the transmission power of the traffic-limited terminal. The simulation results show that, compared with the non-cooperative scheme, the proposed scheme can effectively reduce the terminals' limited resource overhead by offloading computation through reciprocal cooperation.
Coupling similarity-based approach for categorizing spatial database query results
BI Chongchun, MENG Xiangfu, ZHANG Xiaoyan, TANG Yanhuan, TANG Xiaoliang, LIANG Haibo
Journal of Computer Applications    2018, 38 (1): 152-158.   DOI: 10.11772/j.issn.1001-9081.2017051219
A common spatial query often returns a large number of results because a spatial database usually contains a large amount of data. To deal with this problem, a new categorization approach for spatial database query results was proposed. The solution consists of two steps. In the offline step, the coupling relationship between spatial objects was evaluated by considering both their location proximity and semantic similarity, and a set of clusters over the spatial objects was then generated by a probability-density-based clustering method, where each cluster represented one type of user requirement. In the online step, for a given spatial query, a category tree was dynamically generated for the user by applying a modified C4.5 decision tree algorithm over the clusters, so that the user could easily select the subset of query results matching his/her needs by exploring the labels assigned to the intermediate nodes of the tree. The experimental results demonstrate that the proposed spatial object clustering method can efficiently capture both the semantic and location relationships between spatial objects, and that the query result categorization algorithm is effective with low search cost.
Text image restoration algorithm based on sparse coding and ridge regression
WANG Zhiyi, BI Duyan, XIONG Lei, FAN Zunlin, ZHANG Xiaoyu
Journal of Computer Applications    2017, 37 (9): 2648-2651.   DOI: 10.11772/j.issn.1001-9081.2017.09.2648
To address the limited expressiveness of dictionary atoms and the high computational complexity of sparse coding in text image restoration, a text image restoration algorithm based on sparse coding and ridge regression was proposed. Firstly, at the training stage, image patches were used to train a dictionary for sparse representation, and the sampled image patches were clustered according to their Euclidean distances to the dictionary atoms. Then, ridge regressors between low-quality text image patches and clear text image patches were constructed in the local manifold space, achieving a local multi-linear extension of the dictionary atoms and fast calculation. Finally, at the testing stage, clear text image patches were computed directly by searching for the dictionary atoms most similar to the low-quality patches, without computing the sparse codes of the low-quality patches. The experimental results show that compared with existing sparse coding algorithms, the proposed algorithm improves the Peak Signal-to-Noise Ratio (PSNR) by 0.3 to 1.1 dB and reduces the computing time by one to two orders of magnitude. Therefore, the method provides a good and fast solution for text image restoration.
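A rough sklearn sketch of the offline/online split: training patches are clustered (KMeans anchors stand in for dictionary atoms), one ridge regressor per cluster maps low-quality patches to clear patches, and at test time each patch is routed to its nearest anchor's regressor; the data are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, d = 2000, 64                                    # 8x8 patches, flattened
clear = rng.standard_normal((n, d))
degraded = clear @ rng.standard_normal((d, d)) * 0.1 + 0.05 * rng.standard_normal((n, d))

# offline: anchors (stand-ins for dictionary atoms) and one ridge regressor per cluster
k = 16
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(degraded)
regressors = [Ridge(alpha=1.0).fit(degraded[km.labels_ == c], clear[km.labels_ == c])
              for c in range(k)]

# online: route each degraded patch to its nearest anchor and apply that regressor
def restore(patches):
    clusters = km.predict(patches)
    return np.vstack([regressors[c].predict(p[None, :]) for p, c in zip(patches, clusters)])

test = degraded[:5]
print(np.mean((restore(test) - clear[:5]) ** 2))   # reconstruction error on a few patches
```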
Optimizing multi-slice real-time interactive visualization for out-of-core seismic volume
JI Lianen, ZHANG Xiaolin, LIANG Shiyi, WANG Bin
Journal of Computer Applications    2017, 37 (9): 2621-2625.   DOI: 10.11772/j.issn.1001-9081.2017.09.2621
During multi-slice interactive visualization of out-of-core seismic volumes on common computing platforms, traditional cache scheduling approaches, which take no account of the spatial relationship between blocks and slices, lead to a low cache hit rate during interaction, and it is also difficult to achieve high rendering quality with common multi-resolution rendering methods. In view of these problems, a cache scheduling strategy named Maximum Distance First Out (MDFO) was designed. Firstly, according to the spatial position of the interactive slice, the scheduling priority of blocks in the cache was adjusted, which ensures that candidate blocks have a higher hit rate when the slice is interacted with continuously. Then, a two-stage slice interaction method was proposed: fixed-resolution blocks guarantee real-time interaction, the final display quality is improved by step-by-step refinement, and the information entropy of block data is further combined to enhance the resolution of the user's region of interest. The experimental results show that the proposed method can effectively improve the overall block hit rate to more than 60%. Meanwhile, the two-stage strategy achieves higher-quality images for application-oriented requirements and resolves the contradiction between interaction efficiency and rendering quality for out-of-core seismic data visualization.
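A small sketch of the Maximum Distance First Out idea: when the block cache is full, the cached block whose center lies farthest from the current slice plane is evicted first; the block geometry and distance measure are simplified assumptions.

```python
import numpy as np

class MDFOCache:
    """Block cache that evicts the block farthest from the current slice plane."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}                      # block id -> block center (x, y, z)

    def distance_to_slice(self, center, axis, slice_pos):
        return abs(center[axis] - slice_pos)  # distance from block center to slice plane

    def access(self, block_id, center, axis, slice_pos):
        if block_id not in self.blocks and len(self.blocks) >= self.capacity:
            victim = max(self.blocks,
                         key=lambda b: self.distance_to_slice(self.blocks[b], axis, slice_pos))
            del self.blocks[victim]           # maximum-distance block leaves first
        self.blocks[block_id] = center

cache = MDFOCache(capacity=4)
rng = np.random.default_rng(0)
for i in range(10):
    center = rng.uniform(0, 100, size=3)
    cache.access(i, center, axis=2, slice_pos=50.0)   # user is dragging a z-slice at z = 50
print(sorted(cache.blocks))
```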